An aggregation algorithm using a multidimensional file in multidimensional OLAP

Authors

  • Young-Koo Lee
  • Kyu-Young Whang
  • Yang-Sae Moon
  • Il-Yeol Song
Abstract

Aggregation is an operation that plays a key role in multidimensional OLAP (MOLAP). Existing aggregation methods in MOLAP have been proposed for file structures such as multidimensional arrays. These file structures are suitable for data with uniform distributions, but do not work well with skewed distributions. In this paper, we consider an aggregation method that uses dynamic multidimensional files adapting to skewed distributions. In these multidimensional files, the sizes of page regions vary according to the data density in these regions, and the pages that belong to a larger region are accessed multiple times while computing aggregations. To solve this problem, we first present an aggregation computation model that uses the new notions of disjoint-inclusive partition and induced space filling curves. Based on this model, we then present a dynamic aggregation algorithm. Using these notions, the algorithm allows us to maximize the effectiveness of the buffer: we control the page access order in such a way that a page being accessed can reside in the buffer until the next access. We have conducted experiments to show the effectiveness of our approach. Experimental results for a real data set show that the algorithm reduces the number of disk accesses by up to 5.09 times compared with a naive algorithm. The results further show that the algorithm achieves a near optimal performance (i.e., normalized I/O = 1.01) with the total main memory (needed for the buffer and the result table) less than 1.0% of the database size. We believe our work also provides an excellent formal basis for investigating further issues in computing aggregations in MOLAP.

Information Sciences 152 (2003) 121–138. doi:10.1016/S0020-0255(03)00077-X. © 2003 Published by Elsevier Science Inc.
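
The effect of page access order on buffer effectiveness can be illustrated with a small simulation. The sketch below is not the paper's algorithm: the page layout, the names `PAGES`, `page_of`, `morton`, and `count_disk_reads`, the LRU buffer policy, and the use of a plain Z-order (Morton) curve in place of the paper's induced space filling curves are all assumptions made for illustration. It models a 4x4 grid of cells in which sparse areas are covered by large page regions and a dense area by small ones, then counts disk reads when cells are visited in row-major order versus Z-order.

```python
from collections import OrderedDict

# Hypothetical page regions on a 4x4 grid, as half-open rectangles
# (x_lo, x_hi, y_lo, y_hi). Sparse quadrants get one large region each;
# the dense quadrant is split into four single-cell regions.
PAGES = {
    "A": (0, 2, 0, 2),
    "B": (2, 4, 0, 2),
    "C": (0, 2, 2, 4),
    "D": (2, 3, 2, 3),
    "E": (3, 4, 2, 3),
    "F": (2, 3, 3, 4),
    "G": (3, 4, 3, 4),
}


def page_of(x, y):
    """Return the id of the page whose region contains cell (x, y)."""
    for page, (x_lo, x_hi, y_lo, y_hi) in PAGES.items():
        if x_lo <= x < x_hi and y_lo <= y < y_hi:
            return page
    raise ValueError(f"cell ({x}, {y}) is not covered by any page")


def morton(x, y, bits=2):
    """Interleave the bits of x and y to get the Z-order position."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def count_disk_reads(cells, capacity):
    """Count page reads for a cell visiting order with an LRU buffer."""
    buffer = OrderedDict()  # page id -> True, least recently used first
    reads = 0
    for x, y in cells:
        page = page_of(x, y)
        if page in buffer:
            buffer.move_to_end(page)  # buffer hit: refresh recency
        else:
            reads += 1  # buffer miss: the page is read from disk
            buffer[page] = True
            if len(buffer) > capacity:
                buffer.popitem(last=False)  # evict least recently used
    return reads


cells = [(x, y) for y in range(4) for x in range(4)]  # row-major order
row_major_reads = count_disk_reads(cells, capacity=1)
z_order_reads = count_disk_reads(sorted(cells, key=lambda c: morton(*c)), capacity=1)
print("row-major:", row_major_reads, "z-order:", z_order_reads)
```

With a one-page buffer, the row-major order leaves and re-enters the large regions, so their pages are read more than once, while the Z-order visits every cell of a region consecutively and reads each of the seven pages exactly once. The regions here are deliberately aligned to quadrants so that the Z-order fits them; handling regions of arbitrary shape is the problem the paper's model addresses.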

Similar resources

Aggregation Algorithms for Very Large Compressed Data Warehouses

Many efficient algorithms to compute multidimensional aggregation and Cube for relational OLAP have been developed. However, to our knowledge, there is nothing to date in the literature on aggregation algorithms on compressed data warehouses for multidimensional OLAP. This paper presents a set of aggregation algorithms on very large compressed data warehouses for multidimensional OLAP. These al...

A One-Pass Aggregation Algorithm with the Optimal Buffer Size in Multidimensional OLAP

Aggregation is an operation that plays a key role in multidimensional OLAP (MOLAP). Existing aggregation methods in MOLAP have been proposed for file structures such as multidimensional arrays. These file structures are suitable for data with uniform distributions, but do not work well with skewed distributions. In this paper, we consider an aggregation method that uses dynamic multidimensional...

Efficient Aggregation Algorithms for Compressed Data Warehouses

Aggregation and cube are important operations for online analytical processing (OLAP). Many efficient algorithms to compute aggregation and cube for relational OLAP have been developed. Some work has been done on efficiently computing cube for multidimensional data warehouses that store data sets in multidimensional arrays rather than in tables. However, to our knowledge, there is nothing to d...

Representation of Aggregation Knowledge in OLAP Systems

Decision support systems are mainly based on multidimensional modeling. Using On-Line Analytical Processing (OLAP) tools, decision makers navigate through and analyze multidimensional data. Typically, users need to analyze data at different aggregation levels, using OLAP operators such as roll-up and drill-down. Roll-up operators decrease the details of the measure, aggregating it along the dim...

Multidimensional Database Modelling with Differentiated Multiple Aggregations

Many solutions have been defined for multidimensional database modelling. These propositions consider the same aggregation function to determine the values of an indicator accor...

Journal title:
  • Inf. Sci.

Volume 152  Issue

Pages 121–138

Publication date 2003